CIRCUMSPECT DESCENT PREVAILS IN SOLVING 
RANDOM CONSTRAINT SATISFACTION PROBLEMS 



MIKKO ALAVA, JOHN ARDELIUS, ERIK AURELL, PETTERI KASKI, 
SUPRIYA KRISHNAMURTHY, PEKKA ORPONEN, AND SAKARI SEITZ 

Abstract. We study the performance of stochastic local search
algorithms for random instances of the K-satisfiability (K-SAT)
problem. We introduce a new stochastic local search algorithm,
ChainSAT, which moves in the energy landscape of a problem instance
by never going upwards in energy. ChainSAT is a focused algorithm in
the sense that it considers only variables occurring in unsatisfied
clauses. We show by extensive numerical investigations that ChainSAT
and other focused algorithms solve large K-SAT instances almost
surely in linear time, up to high clause-to-variable ratios α; for
example, for K = 4 we observe linear-time performance well beyond the
recently postulated clustering and condensation transitions in the
solution space. The performance of ChainSAT is a surprise given that
by design the algorithm gets trapped into the first local energy
minimum it encounters, yet no such minima are encountered. We also
study the geometry of the solution space as accessed by stochastic
local search algorithms.



1. Introduction 

1.1. Background. Constraint satisfaction problems (CSPs) are the
industrial, commercial and often very large-scale analogues of popular
leisure-time pursuits such as the sudoku puzzle. They can be
formulated abstractly in terms of N variables x_1, x_2, ..., x_N and M
constraints, where each variable x_i takes a value in a finite set and
each constraint forbids certain combinations of values to the
variables. The classical example of a worst-case intractable [5]
constraint satisfaction problem is the K-satisfiability (K-SAT)
problem [7], where each variable takes a Boolean value (either 0 or 1)
and each constraint is a clause over K variables disallowing one out
of the 2^K possible combinations of values. An instance of K-SAT can
also be interpreted directly as a spin system of statistical physics.
Each constraint corresponds to a K-spin interaction in a Hamiltonian,
and the spins represent the original variables; the ground states of
the Hamiltonian correspond to the solutions, that is, assignments of
values to the variables that satisfy all the clauses (see [H]).

It was first observed in the context of K-SAT, and then in the context
of several other CSPs, that ensembles of random CSPs have a "phase
transition," a sharp change in the likelihood of being solvable [18].
Empirically, algorithms have been observed to fail or have
difficulties in the immediate neighbourhood of such phase transition
points, a fact which has given rise to a large literature [8]. Large
unstructured CSPs are solved either by general-purpose deterministic
methods, of which the archetypal example is the
Davis-Putnam-Logemann-Loveland (DPLL) algorithm [6], or using more
tailored algorithms, such as the Survey Propagation (SP) algorithm
[17] motivated by spin glass theory, or variants of stochastic local
search techniques [H HOl |25].

Stochastic local search (SLS) methods are competitive on some of the
largest and least structured problems of interest [11], in particular
on random K-SAT instances. Such instances are constructed by selecting
M clauses over the N variables independently and uniformly at random;
the parameter controlling the satisfiability of an instance is
α = M/N, the ratio of clauses to variables. SLS algorithms work by
making successive random changes to a trial configuration (assignment
of values to the variables) based on information about a local
neighbourhood in the set of all possible configurations. Their modern
history starts with the celebrated simulated annealing algorithm of
Kirkpatrick, Gelatt and Vecchi [12]. From the perspective of K-SAT,
the next fundamental step forward was an algorithm of Papadimitriou
[21], now often called RandomWalkSAT, which introduced the notion of
focusing the random moves to rectify broken constraints.
RandomWalkSAT has been shown, by simulation and theoretical arguments,
to solve the paradigmatic case of random 3-satisfiability up to about
α = 2.7 clauses per variable, almost surely in time linear in N
[11[26]. A subsequent influential development occurred with Selman,
Kautz and Cohen's WalkSAT algorithm [23], which mixes focused random
and greedy moves for better performance. We have previously shown that
WalkSAT and several other stochastic local search heuristics work
almost surely in linear time, up to at least α = 4.21 clauses per
variable [2, 3, 23]. In comparison, the
satisfiability/unsatisfiability threshold of random 3-satisfiability
is believed to be at α = 4.267 clauses per variable [15].
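
The random K-SAT ensemble and the energy of a configuration (the
number of unsatisfied clauses) can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not the
generator used in the experiments: the names random_ksat and energy,
the signed-integer encoding of literals, and the choice of K distinct
variables per clause are all assumptions made here for concreteness.

```python
import random


def random_ksat(n_vars, n_clauses, k, rng):
    """Draw n_clauses clauses independently and uniformly at random.

    Assumed encoding: a clause is a tuple of k signed integers over k
    distinct variables numbered 1..n_vars; literal +i is true when
    variable i is True, and -i is true when variable i is False.
    """
    clauses = []
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses


def energy(clauses, assignment):
    """Number of clauses left unsatisfied by assignment (index 0 unused)."""
    return sum(
        not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


rng = random.Random(0)
n, k, alpha = 1000, 4, 9.6
clauses = random_ksat(n, round(alpha * n), k, rng)
assignment = [None] + [rng.random() < 0.5 for _ in range(n)]
print(len(clauses), energy(clauses, assignment))
```

A uniformly random configuration leaves each clause unsatisfied with
probability 2^-K, so for these parameters the printed energy should be
close to M/16 = 600.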

1.2. The present work. The present work carries out a first
systematic empirical study of random K-SAT for K = 4. Our motivation
for this study is threefold.

Testing the limits of local search. It has been empirically observed
for K = 3 that many SLS algorithms have a linear-time regime, which
extends to the immediate vicinity of the phase transition point
[2, 3, 23]. Thus, a similar investigation for higher K is warranted.
Here we focus on K = 4.

The structure of the space of solutions. Recent rigorous results and
non-rigorous predictions from spin-glass theory suggest that the
structure of the space of solutions of a random K-SAT instance
undergoes various qualitative changes for K ≥ 4, the implications of
which for the performance of algorithms should be investigated.

Mezard, Mora and Zecchina [TH] have shown rigorously that for K ≥ 8
the space of solutions of random K-SAT breaks into multiple clusters
separated by extensive Hamming distance. (The Hamming distance of two
Boolean vectors of length N is the number of positions in which the
vectors differ divided by N.) In more precise terms, an instance of
K-SAT is x-satisfiable if it has a pair of solutions with normalized
Hamming distance x, where 0 ≤ x ≤ 1. Mezard, Mora and Zecchina [TB]
show that, for K ≥ 8, there exists an interval (a, b),
0 < a < b < 1/2, such that, with high probability as N → ∞, a random
instance ceases to be x-satisfiable for all x ∈ (a, b) at a smaller
value of α than that at which it ceases to be x-satisfiable for some
x ∈ [b, 1/2].

For K = 4, we see no evidence of gaps in the empirical
x-satisfiability spectrum in the linear-time regime of SLS algorithms,
which includes the predicted spin-glass theoretic clustering points.
In light of the rigorous results for K ≥ 8, this suggests that the
cases K = 4 and K = 8 may be qualitatively different. Moreover, we
observe that recently predicted spin-glass-theoretic clustering
thresholds (Krzakala et al. [13]) have no impact on algorithm
performance. This raises the question whether the energy landscape of
random K-SAT for small K is in some regard more elementary than has
been previously believed.

The structure of the energy landscape. In the context of random K- 
SAT it is common folklore that SLS algorithms appear to benefit from 
circumspect descent in energy, that is, from a very conservative policy 
of lowering the number of clauses not satisfied by the trial configura- 
tion. To explore this issue further, we introduce a new SLS algorithm 
which we call ChainSAT. It is based on three ideas: (1) focusing, (2) 
easing difficult-to-satisfy constraints by so-called chaining moves, and 
(3) never going upwards in energy; that is, the number of unsatisfied 
clauses is a non-increasing function of the sequence of trial configura- 
tions traversed by the algorithm. 

By design, ChainSAT cannot escape from a local minimum of energy in
the energy landscape. Yet, empirically ChainSAT is able to find a
solution, almost surely in linear time, up to values of α reached by
SLS algorithms that are allowed to go up in energy, such as the
Focused Metropolis Search [23]. This observation further supports the
position that random K-SAT for small K may be more elementary than has
been previously believed.

    S = random assignment of values to the variables
    while S is not a solution do
        C = a clause not satisfied by S selected uniformly at random
        V = a variable in C selected uniformly at random
        ΔE = change in the number of unsatisfied clauses if V is flipped in S
        if ΔE ≤ 0 then
            flip V in S
        else
            with probability η^ΔE
                flip V in S
            end with
        end if
    end while

Figure 1. The Focused Metropolis Search algorithm [23].
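
The Focused Metropolis Search pseudocode of Figure 1 can be sketched
in Python as follows. This is an illustrative sketch, not the
implementation used for the experiments: the function name fms, the
signed-integer clause encoding, the bookkeeping of true-literal counts
and the counting of loop iterations (rather than accepted flips) as
"steps" are all assumptions made here.

```python
import random


def fms(clauses, n_vars, eta, max_steps, rng):
    """Focused Metropolis Search sketch.

    Clauses are tuples of signed integers (+i / -i for variable i, numbered
    from 1), each variable occurring at most once per clause. Returns
    (assignment, steps) on success and (None, max_steps) otherwise.
    """
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]
    occurs = [[] for _ in range(n_vars + 1)]  # variable -> [(clause, sign)]
    for ci, clause in enumerate(clauses):
        for lit in clause:
            occurs[abs(lit)].append((ci, lit > 0))
    n_true = [sum(assign[abs(l)] == (l > 0) for l in c) for c in clauses]
    unsat = [ci for ci, t in enumerate(n_true) if t == 0]
    where = {ci: i for i, ci in enumerate(unsat)}  # position inside unsat

    def flip(v):
        assign[v] = not assign[v]
        for ci, sign in occurs[v]:
            if assign[v] == sign:  # the literal of v in clause ci became true
                n_true[ci] += 1
                if n_true[ci] == 1:  # clause ci just became satisfied
                    i, last = where.pop(ci), unsat.pop()
                    if last != ci:
                        unsat[i], where[last] = last, i
            else:  # the literal became false
                n_true[ci] -= 1
                if n_true[ci] == 0:  # clause ci just became unsatisfied
                    where[ci] = len(unsat)
                    unsat.append(ci)

    for step in range(max_steps):
        if not unsat:
            return assign, step
        v = abs(rng.choice(clauses[rng.choice(unsat)]))  # focused choice
        delta_e = 0
        for ci, sign in occurs[v]:
            if n_true[ci] == 0:
                delta_e -= 1  # flipping v repairs this clause
            elif n_true[ci] == 1 and assign[v] == sign:
                delta_e += 1  # v is the only true literal: flipping breaks it
        if delta_e <= 0 or rng.random() < eta ** delta_e:
            flip(v)
    return (assign, max_steps) if not unsat else (None, max_steps)


rng = random.Random(1)
n, k, alpha = 200, 4, 5.0
clauses = [
    tuple(v if rng.random() < 0.5 else -v for v in rng.sample(range(1, n + 1), k))
    for _ in range(round(alpha * n))
]
solution, steps = fms(clauses, n, eta=0.3, max_steps=2000 * n, rng=rng)
print("solved:", solution is not None, "steps per variable:", steps / n)
```

The small instance (N = 200, α = 5) is chosen only so that the example
runs quickly; it is far from the hard regime studied in the paper.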

1.3. Organization of the paper. Section 2 documents our experiments
with the FMS algorithm on random K-SAT for K = 4. Section 3 contains
an empirical investigation of x-satisfiability in random K-SAT for
K = 4 using the FMS algorithm. Section 4 introduces the ChainSAT
algorithm and studies its performance on random K-SAT for
K = 4, 5, 6. Section 5 presents a few concluding remarks.

2. Experiments with Focused Metropolis Search 

The Focused Metropolis Search (FMS) algorithm [23] is given in
pseudocode in Figure 1. This section documents our experiments aimed
at charting the empirical linear-time region of FMS on random K-SAT
for K = 4.

2.1. Selecting the temperature parameter. For K = 3 it has already
been established that the FMS algorithm has an "operating window" in
terms of the adjustable "temperature" parameter η [23]. For too large
values of η, the linearity (in N) is destroyed by large fluctuations
that keep the algorithm from reaching low energies, and hence the
solution. For too small values of η, the algorithm becomes "too
greedy", leading to a divergence of solution times. Thus, to obtain
performance linear in N, it is necessary to carefully optimize the
parameter η.

Figure 2 shows a typical result of the optimization of the temperature
parameter η for random K-SAT with K = 4. Two quantities are plotted:
the fraction of instances solved (within a threshold number of flips
per variable) and, when all instances are solved, the corresponding
average solution time.

Figure 2. Optimizing the temperature parameter η for the FMS algorithm
on instances of random K-SAT at K = 4 and α = 9.6. Displayed on the
horizontal axis is the temperature parameter η. Plotted on the
vertical axis is the fraction of 21 random instances solved within
60000 × N flips at N = 100000. In the case all 21 instances are
solved, also plotted is the average solution time (in flips/N). The
optimum is at η = 0.293. Note the narrowness of the operating window
in terms of η.



2.2. The empirical linear-time regime of FMS. It is evident from
Figure 2 that for K = 4 and α = 9.6 the operating window of FMS is
already very narrow; thus it is striking that the empirical
performance of FMS is almost surely linear in N within the window.

Figure 3. Cumulative distributions of solution times normalized by the
number of variables N for the Focused Metropolis Search algorithm [23]
on instances of random K-satisfiability at K = 4 and α = 9.6. The
vertical axis indicates the fraction of 1001 random instances solved
within a given running time, measured in flips/N on the horizontal
axis. Inset: the scaling of the algorithm as α increases (with
N = 100000). The "temperature" parameter of FMS is set to η = 0.293.



In Figure 3 we present empirical evidence that FMS almost surely runs
in time linear in N for instances of random K-satisfiability with
K = 4. The fact that the curves get steeper with increasing N implies
concentration of solution times, that is, above-average and
below-average solution times get rarer with increasing N. Note that
the scaling implies performance almost surely linear in N, and
demonstrates that the linear-time regime of FMS extends beyond the
predicted [13] spin-glass theoretic "dynamical" and "condensation"
transition points.

3. Experiments on x-satisfiability using FMS

Our experimental setup to investigate x-satisfiability is as follows.
For given values of α and N, we first generate a random K-SAT
instance, and find one reference solution of this instance using FMS.
Then, using FMS, we search for other solutions in the same instance.
The initial configuration S for FMS is selected uniformly at random
from the set of all configurations having a given Hamming distance to
the reference solution. When FMS finds a solution, we record the
distance x of the solution found to the reference solution.
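
The two ingredients of this setup that do not depend on the solver,
the normalized Hamming distance and the choice of an initial
configuration at a prescribed distance from the reference solution,
can be sketched in Python as follows. The function names and the
rounding of x·N to a whole number of flipped positions are assumptions
made for this illustration.

```python
import random


def normalized_hamming(a, b):
    """Fraction of positions in which two equal-length Boolean vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def random_at_distance(reference, x, rng):
    """Uniformly random configuration at normalized distance x from reference.

    The number of flipped positions is round(x * N); choosing that many
    positions uniformly without replacement makes every configuration at
    that distance equally likely.
    """
    n = len(reference)
    d = round(x * n)
    if not 0 <= d <= n:
        raise ValueError("x must lie between 0 and 1")
    flipped = set(rng.sample(range(n), d))
    return [(not v) if i in flipped else v for i, v in enumerate(reference)]


rng = random.Random(2)
reference = [rng.random() < 0.5 for _ in range(1000)]
start = random_at_distance(reference, 0.25, rng)
print(normalized_hamming(reference, start))
```

In an experiment of the kind described above, start would be handed
to the solver as its initial configuration, and normalized_hamming
would then be applied to the reference solution and the solution
found.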

Our experiments on random K-SAT for K = 4 did not reveal any gaps in
the x-satisfiability spectrum, even for α = 9.6, beyond the predicted
spin-glass theoretic "dynamical" and "condensation" transition points
[13]. In particular, Figure 4 gives empirical evidence that solutions
are found at all distances smaller than the typical distance of
solutions found by FMS. This is in contrast to the numerical results
of Battaglia et al. for a balanced version of K = 5 ^Sj.

Here it should be pointed out that the solutions found by stochastic
local search need not be typical solutions in the space of all
solutions: there can be other solutions that are not reached by FMS or
other algorithms. Evidence of this is reflected in the "whiteness"
status of solutions (see [23], [22], and Section 4.3): all the
solutions found in our experiments were completely white, that is,
they do not have locally frozen variables. One can of course imagine
that a "typical solution" is not white under the circumstances
examined here, but as noted there is no evidence of the existence of
such solutions.

Figure 5 summarizes the results of a scaling analysis with increasing
N over five random instances and reference solutions. The distance
distributions appear to converge to some specific curve without
vertical sections, the absence of which suggests that the
x-satisfiability spectrum has no gaps below the typical distance of
solutions found by FMS in the limit of infinite N.

Figure 6 summarizes the results of a scaling analysis with increasing
α. We see that the typical distance between solutions found by FMS
decreases with increasing α, and that no clear gaps are apparent in
the distance data.



4. Experiments with ChainSAT 

A new heuristic which never moves up in energy is here shown to solve
random satisfiability problems almost surely in time linear in N, for
K = 4, 5, 6.

Figure 4. Investigation of x-satisfiability using FMS initialized with
a random configuration at a given Hamming distance from a reference
solution. One reference solution and one instance of random K-SAT at
K = 4, α = 9.6, and N = 200000. The horizontal axis displays the
normalized Hamming distance of the initial configuration to the
reference solution. The vertical axis displays the normalized Hamming
distance of the solution found to the reference solution. All of the
plotted 1601 searches produced a solution, and no gaps are visible in
the vertical axis, suggesting asymptotic x-satisfiability for
x < 0.37. The temperature parameter of FMS is set to η = 0.293.

4.1. The ChainSAT algorithm. Our new heuristic, ChainSAT, is given in
pseudocode in Figure 7. The algorithm (a) never increases the energy
of the current configuration S; and (b) exercises circumspection in
decreasing the energy. In particular, moves that decrease the energy
are taken only sporadically compared with equi-energetic moves and
chaining moves. The latter are designed to alleviate critically
satisfied constraints by proceeding in "chains" of
variable-clause-variable until a variable is found which can be
flipped without an increase in energy. Focusing is employed for the
non-chaining moves. The basic idea behind the structure of ChainSAT is
to help flip a variable so as to satisfy an originally broken
constraint.

Figure 5. Scaling of x-satisfiability data obtained using FMS on
random K-SAT with increasing N. The parameters K = 4, α = 9.6, and
η = 0.293 are fixed. The plotted 10- and 90-percentile curves are
calculated from five random instances and reference solutions for each
N = 50000, 100000, 200000, with a moving window size of 0.004 in the
horizontal axis. The distances appear to converge close to the
90%-curves.

The ChainSAT algorithm has two adjustable parameters, one (p_1) for
controlling the rate of descent (by accepting energy-lowering flips)
and another (p_2) for limiting the length of the chains to avoid
looping. We omit data related to the optimization of these parameters
since the procedure is simply empirical (vary the parameters, check
the outcome), similar to the one documented for the FMS algorithm in
Figure 2.

Figure 6. Scaling of x-satisfiability data obtained using FMS on
random K-SAT with increasing α. The parameters K = 4, N = 100000, and
η = 0.293 are fixed. One random instance and one reference solution
for each α = 8.0, 9.0, 9.45; see Figure 5 for α = 9.6. The value
α = 9.45 is between the predicted locations of the dynamical and the
condensation transition points [13]. No clear gap in distances is
discernible in any of the cases.



4.2. ChainSAT performance. In Figure 8 we present empirical evidence
that ChainSAT almost surely runs in time linear in N for random
K-satisfiability problems with K = 4, 5, 6. The fact that the curves
get steeper with increasing N implies concentration of solution times,
that is, above-average and below-average solution times get rarer
with increasing N.

Since the algorithm never goes uphill in the energy landscape, local
energy minima cannot be an obstruction to finding solutions, at least
in the region of the energy landscape visited by this algorithm. On
the other hand, when ChainSAT fails to find a solution in linear time,
this can also result from simply getting lost: in particular, the
fraction of moves that lower the energy over those that keep it
constant may dwindle to zero.

    S = random assignment of values to the variables
    chaining = FALSE
    while S is not a solution do
        if not chaining then
            C = a clause not satisfied by S selected uniformly at random
            V = a variable in C selected u.a.r.
        end if
        ΔE = change in the number of unsatisfied clauses if V is flipped in S
        chaining = FALSE
        if ΔE = 0 then
            flip V in S
        else if ΔE < 0 then
            with probability p_1
                flip V in S
            end with
        else
            with probability 1 − p_2
                C = a clause satisfied only by V selected u.a.r.
                V' = a variable in C other than V selected u.a.r.
                V = V'
                chaining = TRUE
            end with
        end if
    end while

Figure 7. The ChainSAT algorithm.
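
The ChainSAT pseudocode of Figure 7 can be sketched in Python as
follows. This is an illustrative sketch under stated assumptions, not
the implementation used for the experiments: the function name
chainsat, the signed-integer clause encoding, the max_steps cut-off,
and the counters loops and picks (iterations of the main loop and
focused picks, respectively) are introduced here. The parameter
values in the usage example are chosen only so that a small, easy
instance is solved quickly; they are much larger than the values used
in the paper.

```python
import random


def chainsat(clauses, n_vars, p1, p2, max_steps, rng):
    """ChainSAT sketch; every clause needs at least two distinct variables.

    Returns (assignment or None, loops, picks), where loops counts
    iterations of the main loop and picks counts focused variable picks.
    """
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]
    occurs = [[] for _ in range(n_vars + 1)]  # variable -> [(clause, sign)]
    for ci, clause in enumerate(clauses):
        for lit in clause:
            occurs[abs(lit)].append((ci, lit > 0))
    n_true = [sum(assign[abs(l)] == (l > 0) for l in c) for c in clauses]
    unsat = [ci for ci, t in enumerate(n_true) if t == 0]
    where = {ci: i for i, ci in enumerate(unsat)}  # position inside unsat

    def flip(v):
        assign[v] = not assign[v]
        for ci, sign in occurs[v]:
            if assign[v] == sign:  # literal became true
                n_true[ci] += 1
                if n_true[ci] == 1:  # clause ci just became satisfied
                    i, last = where.pop(ci), unsat.pop()
                    if last != ci:
                        unsat[i], where[last] = last, i
            else:  # literal became false
                n_true[ci] -= 1
                if n_true[ci] == 0:  # clause ci just became unsatisfied
                    where[ci] = len(unsat)
                    unsat.append(ci)

    chaining, v, loops, picks = False, None, 0, 0
    while unsat and loops < max_steps:
        loops += 1
        if not chaining:
            picks += 1
            v = abs(rng.choice(clauses[rng.choice(unsat)]))  # focused pick
        # Clauses whose only true literal belongs to v break if v is flipped.
        critical = [ci for ci, s in occurs[v] if n_true[ci] == 1 and assign[v] == s]
        repaired = sum(1 for ci, _ in occurs[v] if n_true[ci] == 0)
        delta_e = len(critical) - repaired
        chaining = False
        if delta_e == 0:
            flip(v)
        elif delta_e < 0:
            if rng.random() < p1:  # descend only sporadically
                flip(v)
        elif rng.random() < 1 - p2:  # chaining move, energy unchanged
            clause = clauses[rng.choice(critical)]
            v = rng.choice([abs(lit) for lit in clause if abs(lit) != v])
            chaining = True
    return (assign if not unsat else None), loops, picks


rng = random.Random(3)
n, k, alpha = 100, 4, 3.0
clauses = [
    tuple(v if rng.random() < 0.5 else -v for v in rng.sample(range(1, n + 1), k))
    for _ in range(round(alpha * n))
]
solution, loops, picks = chainsat(clauses, n, 0.05, 0.01, 500_000, rng)
if picks:
    print("average chain length:", loops / picks - 1)
```

Because a flip is performed only when ΔE ≤ 0, the length of the unsat
list never grows between iterations, which is the non-increasing
energy property stated in the text.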



4.3. Whiteness. To provide a further empirical analysis of ChainSAT,
we next present Figure 9. The analysis is not in terms of solution
times and the range of α achievable with a bit of tuning, but in terms
of two quantities: (i) the average chain length l_chain during the
course of finding a solution and (ii) the average whiteness depth
(AWD). In more precise terms, the average chain length is
l_chain = l/m − 1, where l is the total number of iterations of the
main loop of ChainSAT and m is the number of times the body of the
if-statement controlled by the chaining flag in the main loop is
executed.

The AWD is related to the result of the so-called whitening procedure
[22], described in pseudocode in Figure 10, that is applied to the
solution found when ChainSAT terminates. The whiteness depth of a
variable is defined as the value of D in the whitening procedure at
the time the variable gets marked (whitened); the value is infinite if
the variable never gets marked (whitened) during the whitening
procedure. The AWD of a solution is the average of the whiteness
depths of the variables. See [23] for an empirical discussion of AWD
in the context of random K-SAT for K = 3. The key observation here is
that the solutions found by ChainSAT all have a finite AWD. In loose
terms, this means that there is "slack" in the solution.

Figure 8. Cumulative distributions of solution times normalized by the
number of variables N for the ChainSAT algorithm on random
K-satisfiability instances at K = 4 and α = 9.55. The vertical axis
indicates the fraction of 1001 random input instances solved within a
given running time, measured in flips/N on the horizontal axis. Inset:
the scaling of the algorithm for K = 4, 5, 6 at N = 100000 with
increasing α; the values of α(K) in the horizontal axis have been
normalized with α_sat(K), which has the empirical values
α_sat(4) = 9.931, α_sat(5) = 21.117, and α_sat(6) = 43.37 [15]. The
parameters of ChainSAT have been chosen to be small enough to work at
least up to the predicted "dynamical transition" [13]: we have set
p_1 = p_2 = 0.0001 (K = 4), 0.0002 (K = 5), and 0.0005 (K = 6).

Figure 9. The average chain length in ChainSAT and the average
whiteness depth of the solutions found in random K-SAT for
K = 4, 5, 6. Each plotted value is the average over 21 random
instances. The values of α(K) in the horizontal axis have been
normalized with α_sat(K), which has the empirical values
α_sat(4) = 9.931, α_sat(5) = 21.117, and α_sat(6) = 43.37 [15]. The
ChainSAT parameters are set to p_1 = p_2 = 0.0001 (K = 4),
0.0002 (K = 5), and 0.0005 (K = 6).



Based on Figure 9 it is clear that increasing the value of α has the
same effect for K = 4, 5, 6: the average chain length l_chain
increases, and so does the AWD. Note that the ratio AWD/l_chain
increases with α.



5. Concluding remarks 

We have here shown empirically that local search heuristics can be
designed to avoid traps and "freezing" in random K-satisfiability,
with solution times scaling linearly in N. This requires that
circumspection is exercised: too greedy a descent causes the studied
algorithms to fail for reasons unclear. A physics-inspired
interpretation is that during a run the algorithm has to "equilibrate"
on a constant energy surface.

    initially all clauses and variables are unmarked (non-white)
    mark (whiten) every clause that is unsatisfied
    mark (whiten) every clause that has more than one true literal
    D = 0
    repeat
        mark (whiten) any unmarked variables that appear as satisfying
            literals only in marked clauses
        if all the variables are marked then
            declare that S is completely white
            halt
        end if
        if no new variables were marked in this iteration then
            declare that S has a core
            halt
        end if
        mark (whiten) any unmarked clauses that contain at least
            one marked variable
        D = D + 1
    end repeat

Figure 10. The whitening algorithm for a configuration S.
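
The whitening procedure of Figure 10 and the resulting average
whiteness depth can be sketched in Python as follows. This is a simple
illustration that rescans all unmarked variables in every round, not
an optimized implementation; the function names, the signed-integer
clause encoding and the use of math.inf for variables that are never
whitened are assumptions made here.

```python
import math


def whitening_depths(clauses, assignment):
    """Whiteness depth of each variable 1..N (math.inf if never whitened).

    Clauses are tuples of signed integers; assignment[i] is the Boolean
    value of variable i (index 0 is unused).
    """
    n_vars = len(assignment) - 1
    satisfies = [[] for _ in range(n_vars + 1)]  # clauses where v's literal is true
    contains = [[] for _ in range(n_vars + 1)]   # clauses in which v occurs
    marked_clause = []
    for ci, clause in enumerate(clauses):
        n_true = 0
        for lit in clause:
            v = abs(lit)
            contains[v].append(ci)
            if assignment[v] == (lit > 0):
                satisfies[v].append(ci)
                n_true += 1
        # Initially mark unsatisfied clauses and clauses with > 1 true literal.
        marked_clause.append(n_true != 1)

    depth = [math.inf] * (n_vars + 1)
    unmarked = set(range(1, n_vars + 1))
    d = 0
    while unmarked:
        # Variables that are satisfying literals only in marked clauses.
        newly = [v for v in unmarked if all(marked_clause[ci] for ci in satisfies[v])]
        if not newly:
            break  # no progress: the configuration has a core
        for v in newly:
            depth[v] = d
            for ci in contains[v]:
                marked_clause[ci] = True
        unmarked.difference_update(newly)
        d += 1
    return depth[1:]


def average_whiteness_depth(depths):
    """Average whiteness depth; infinite if some variable is never whitened."""
    return sum(depths) / len(depths)


depths = whitening_depths([(1, 2)], [None, True, False])
print(depths, average_whiteness_depth(depths))
```

In the one-clause example, variable 2 satisfies no clause and is
whitened at D = 0; this marks the clause, after which variable 1 is
whitened at D = 1. A configuration is completely white exactly when
all returned depths are finite.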

In terms of the parameter α, the pertinent question is how far the
"easy" region from which one finds these solutions extends. For small
K it may be possible that this region extends all the way to the
satisfiability/unsatisfiability transition point. The empirical
evidence we have here presented points towards a divergence of the
prefactor of the linear scaling in problem size well below α_sat.
Furthermore, this divergence is stronger for higher values of K. For
large values of K, the absence of traps may in any case be considered
unlikely, as the rigorous techniques used to show clustering of
solutions for K ≥ 8 [IB] can also be used to show that there exist
pairs of distant solutions separated by an extensive energy barrier
from each other. This also suggests the existence of local minima
separated by extensive barriers. On the other hand, our present
results for small K give no evidence in this direction. In particular,
for K = 4 we have shown empirically that the energy landscapes can be
navigated with simple randomized heuristics beyond all so far
predicted transition points, apart from the
satisfiability/unsatisfiability transition itself.

Our experiments also strongly suggest that the space of solutions for
K = 4 at least up to α = 9.6 does not break into multiple clusters
separated by extensive distance. All the solutions found have "slack"
in the sense that they have a finite AWD. Is there an efficient way to
find solutions that are not "white" in this sense; put otherwise, is
the existence of "white" solutions necessary for "easy" solvability?

All these observations raise further questions about the structure of
the energy landscape, the solution space, and the workings of
algorithms for random CSPs. They also pose challenges and constraints
for theoretical attempts to understand these, including approaches
from the physics of spin glasses.